Abstract: Network coding (NC), in principle, is a Layer-3 
innovation that improves network throughput in wired networks 
for multicast/broadcast scenarios. Due to the fundamental dif- 
ferences between wired and wireless networks, extending NC to 
wireless networks generates several new and significant practical 
challenges. Two-way information exchange (both symmetric and 
asymmetric) between a pair of 802.11 sources/sinks using an 
intermediate relay node is a canonical scenario for evaluating the 
effectiveness of Wireless Network Coding (WNC) in a practical 
setting. Our primary objective in this work is to suggest pragmatic 
and novel modifications at the MAC and PHY layers of the 802.11 
protocol stack on a Software Radio (SORA) platform to support 
WNC and obtain achievable throughput estimates via lab-scale 
experiments. Our results show that network coding (at the MAC 
or PHY layer) increases system throughput, typically by 20-30%. 



I. Introduction 

The exponential growth of multimedia applications has 
resulted in current 3G cellular networks reaching (and ex- 
ceeding, in the near future) available network capacities. As 
a result, communication engineers must find newer ways to 
continue to increase aggregate throughput while preserving 
Quality of Service (QoS). Cross-layer approaches that seek 
to optimize aggregate network throughput based on adapting 
parameters from the physical or MAC layers have proven 
to be effective in this regard. Recently, Network Coding 
(NC) has attracted researchers' attention as another promising 
innovation in this context. 

A. Network Coding in Wired Networks 

Network coding was initially proposed as a distributed 
mechanism for achieving the multicast theoretic (max-flow, 
min-cut) capacity in wired networks. In wired multicasting, 
information is sent from a set of source nodes to a set 
of destination nodes over a multihop network where the 
intermediate nodes merely forward their received packets via a 
pre-determined look-up table (routing). Ahlswede et al. [1] 
suggested the innovative notion of coding on layer-3 packets 
instead of look-up forwarding on specific outgoing links, and 
showed that network throughput can be increased. 

In a network employing NC, each router computes a (random) 
linear combination of its incoming layer-3 packets and broadcasts 
the result to all its neighbors. Randomized linear network 
coding schemes were shown to be sufficient for achieving the 
information-theoretic max-flow, min-cut bound on network 
capacity [2]. Necessary and sufficient conditions for the design 
of such random linear codes were provided by Koetter et al. 
[3]. While the concept of NC was developed for the network 
(IP) layer, it has often been implemented in practice at higher 
layers, such as the transport or application layers [4], [5], [6]. 

A fundamental reason as to why network coding is benefi- 
cial is based on the premise of simultaneous transmission from 
several (source) nodes to a single (receive) node. While this is 
feasible in a wired network whereby concurrent transmissions 
are deemed 'orthogonal', a multi-hop wireless network is quite 
different. Wireless is a shared medium (at least for nodes 
within a common transmission range) and there is no natural 
spatial orthogonality. Thus wireless multihop networks have 
relied on other forms of orthogonality - in time (TDMA) or 
frequency (FDMA) - to achieve interference-free transmission. 
Wireless Network Coding (WNC) uses non-orthogonal trans- 
missions that, nevertheless, allow recovery of multiple packets 
to enhance aggregate network throughput. 

B. Network Coding in Wireless Networks 

The broadcast nature of wireless (coupled with network 
topology) determines the nature of interference. Simultaneous 
transmissions in a wireless network typically result in all of 
the packets being lost (i.e., collision). A wireless network 
therefore requires a scheduler (as part of the MAC func- 
tionality) to minimize such interference. Hence any gains 
from network coding are strongly impacted by the underlying 
scheduler and will deviate from the gains seen in wired 
networks [7]. Further, wireless links are typically half-duplex 
due to hardware constraints; i.e., a node cannot simultaneously 
transmit and receive due to the lack of sufficient isolation 
between the two paths. 

Another important consideration is the impact of the wire- 
less channel on a transmitted signal - inclusive of channel 
attenuation which is assumed negligible on a wire, but may not 
be ignored in wireless. The received signal y over a wireless 
link can be modeled in general as 

y = hx + z, (1) 

where x is the transmitted symbol, z is the additive noise 
sample at the receiver, and h is the (narrowband) channel loss 
between the source and the destination. Some previous work 
on WNC has incorporated aspects of the features mentioned 
above. Omnidirectional source transmissions were modeled 
in [8], [9] as hyperlinks with additional constraints that 
prevent nodes from transmitting and receiving packets 
simultaneously. Interference effects were incorporated by [9] for joint 
optimization of MAC and network flows, where successful 
transmission between a node pair is based on a 
signal-to-interference-plus-noise-ratio (SINR) threshold, thereby 
potentially allowing simultaneous successful reception at different 
receive nodes. 

One of the potential applications of WNC is in multicasting. 
A decentralized formulation of throughput optimization for the 
multicasting problem was introduced in [10], [11]. However, if 
additional objectives such as maximizing throughput subject 
to delay constraints are considered, then network codes must 
be jointly designed with the MAC, as in [12], [13]. The authors of 
[14] qualify the impact of random access MAC schemes (such 
as CSMA/CA) on the performance of NC in an all-to-all data 
dissemination system. 

Information exchange via wireless relays is another natural 
scenario for potential application of network coding on top of 
the MAC layer. In [15], the authors assume a deterministic 
MAC protocol and generalize the canonical three-node 
scenario to the case with an arbitrary number of relays between 
two source nodes. The main issue with MAC layer NC is that 
any existing asymmetry in the system may cause performance 
degradation. In fact, as our experimental results in Section IV 
show, even a small asymmetry in source-node power 
decreases system throughput. Hence, MAC layer NC 
is naturally suited for symmetric transmission rates between 
any node pair. Since NC at the MAC layer operates directly and 
pointwise on the information symbols in the two packets from 
the respective sources, equal-size packets (hence equal rates) 
are necessary. 

The efforts to extend the idea of MAC layer NC to 
asymmetric traffic include [16], [17]. For example, one of these works 
proposes to interpret NC as a mapping of modulation constellations to 
match symmetric traffic. However, it has been shown that, even 
in these schemes, performance is significantly reduced due to a 
lack of symmetry [18], [19]. To deal with asymmetric traffic, 
implementing the XOR operation of NC at the antenna has 
been proposed; this has been dubbed Analog Network 
Coding (ANC) or Physical Layer Network Coding [20]. Recently, 
Chen et al. [18] proposed a new network coding scheme called 
Decode-and-Forward with Joint Modulation (DF-JM). They 
show that DF-JM has the potential to achieve the capacity for 
asymmetric or symmetric traffic. Simulation-based evidence 
of the gains from WNC for a cellular network, in terms of 
achievable rate regions for MAC-layer and PHY-layer network 
coding, was presented in [21], [22], based on the assumption 
of perfectly known channels at every user. 

The above summary of the wireless network coding literature 
suggests that little effort has been put into actually 
implementing WNC concepts in a lab-scale prototype and 
measuring performance in a real setting. One might say that, 
consequently, WNC is a largely unproven idea in practice; our 
work is thus distinguished by demonstrating some of the first 
evidence of the effectiveness of WNC in a canonical relay 
network. 

Fig. 1. A relay network. Nodes A and B want to exchange data through 
node R. 

Fig. 2. (a) 4-step and (b) 3-step information exchange using NC. In both 
scenarios, the MAC protocol is TDMA. 



C. Wireless Relay Networks: A Review 

In this paper we investigate the challenges of implementing 
WNC in a simple relay network using a currently available 
software-defined platform, and suggest potential solutions. We 
present measured system throughput and compare it with 
suitable analytical results for benchmarking. Consider a two- 
way (bi-directional) relay channel with two source/destination 
nodes (nodes A and B), and one intermediate relay node (node 
R), as depicted in Figure 1. By assumption, the relay node 
only forwards data and is not a source or destination. Nodes 
A and B are not within transmission range of each other, and 
they require the relay node to communicate. For a half-duplex 
system, a baseline strategy for A and B is to exchange data 
via TDMA scheduling through node R. In the first time slot, 
node A transmits data to R. In the next time slot, R relays the 
received data to B. In the third time slot, node B transmits data 
to R. Finally, in the fourth time slot, R sends B's information 
to A. As depicted in Figure 2(a), two bits of information 
would thus be exchanged in four time slots. 

As already shown in [23] and summarized in Figure 2(b), it 
is possible to exchange two bits of information in three time 
slots by applying network coding in the MAC layer at the 
relay node, thereby achieving a 33% throughput improvement 
relative to pure TDMA. In the first time slot, A sends its data 
x_A to R. In the second time slot, B transmits its information 
x_B to R. In the third time slot, node R broadcasts x_A ⊕ x_B. 
Since each node knows its own data, upon receiving x_A ⊕ x_B 
it can extract the desired information by XORing out its own packet. 
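
The three-slot exchange can be illustrated with a minimal sketch. The helper name `xor_bytes` and the example payloads are hypothetical and chosen only for illustration; the sketch assumes equal-length packets, as MAC layer NC requires.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bitwise XOR of two equal-length packets (hypothetical helper)."""
    if len(a) != len(b):
        raise ValueError("MAC layer NC needs equal-size packets")
    return bytes(x ^ y for x, y in zip(a, b))

# Illustrative payloads of equal length (12 bytes each).
x_a = b"hello from A"   # slot 1: A -> R
x_b = b"reply from B"   # slot 2: B -> R

coded = xor_bytes(x_a, x_b)             # slot 3: R broadcasts x_A XOR x_B
recovered_at_a = xor_bytes(coded, x_a)  # A removes its own packet
recovered_at_b = xor_bytes(coded, x_b)  # B removes its own packet

# Two packets in 3 slots instead of 4: relative throughput gain.
slots_tdma, slots_nc = 4, 3
gain = slots_tdma / slots_nc - 1        # about 0.33
```

Each node recovers the other's packet because XORing the broadcast with its own stored copy cancels that copy.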

The above throughput enhancement requires an ideal 
TDMA MAC, which deterministically assigns a channel to 
each node in the network. However, one of the most common 
wireless access networks (802.11) uses a random access 
MAC protocol based on carrier sensing (CSMA/CA) for time-sharing 
of the common channel. Besides the obvious advantages 
of a distributed protocol, such random access schemes 
are efficient and fair at low average traffic loads as in several 
data applications. Hence in this work, we first explore the 
challenges of implementing WNC over a system with random 
MAC protocol. The measured MAC layer throughput with 
NC shows significant gains in system throughput, close to the 
analytical predictions. 

As discussed in the previous subsection, MAC layer network 
coding is based on symmetric traffic, whereas many data 
exchange scenarios are asymmetric by nature. Since the relay 
node can be placed at varying distances between both sources, 
the data rates transmitted by the relay node to both destinations 
are typically different. As mentioned, PHY layer NC has 
been proposed to deal with asymmetric traffic. Unfortunately, 
there is relatively little attention paid in the literature to 
system implementation and actual performance measurement 
of applying NC at the PHY layer. This is the second contribution 
of this paper. We extend the concept of Decode-and-Forward 
with Joint Modulation (DF-JM) developed in [18] (state-of-the-art 
PHY layer NC) to the asymmetric relaying problem considered 
here. In other words, we implement DF-JM on an OFDM-based 
(similar to 802.11a) PHY layer on a suitable test-bed platform 
(described later) and measure actual PHY layer NC throughput 
in a real-world application, a first to the best of the authors' 
knowledge. 

The current state of the art in PHY NC treats the addition of 
two (synchronized) analog signals at the relay antenna as the 
XOR operation over the information symbols. However, a 
model closer to reality for such a superposition should include 
the respective channel gains, i.e., 

y = h_1 x_1 + h_2 x_2 + z, (2) 

where x_i is the symbol from source i, and h_i is the narrowband 
channel gain between source i and the relay. Note that, besides 
the requirement of synchronism, successful analog network 
coding also requires full knowledge of the channel gains at the 
relay node. 
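
A small noiseless sketch of (2) shows why the channel gains matter. It assumes BPSK symbols and real-valued gains, and the function name `superposition_levels` is hypothetical. With equal gains the received level alone reveals the XOR of the two bits; with unequal gains the levels split, so the mapping from received level to XOR depends on h_1 and h_2.

```python
from itertools import product

def superposition_levels(h1: float, h2: float) -> dict:
    """Map each noiseless value y = h1*x1 + h2*x2 to the XOR values producing it."""
    levels = {}
    for b1, b2 in product((0, 1), repeat=2):
        x1, x2 = 1 - 2 * b1, 1 - 2 * b2   # BPSK: bit 0 -> +1, bit 1 -> -1
        y = round(h1 * x1 + h2 * x2, 9)   # rounding only to merge equal floats
        levels.setdefault(y, set()).add(b1 ^ b2)
    return levels

equal = superposition_levels(1.0, 1.0)    # levels -2, 0, +2
unequal = superposition_levels(1.0, 0.5)  # levels -1.5, -0.5, +0.5, +1.5
```

In the equal-gain case the level 0 corresponds to XOR = 1 and the levels ±2 to XOR = 0; in the unequal-gain case the relay must know both gains to place its decision thresholds.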

D. Contribution and Organization 

In summary, the main contribution of this paper is 
threefold: (a) to design and modify the CSMA-based MAC layer 
to support network coding, (b) to develop new ideas for PHY 
NC for asymmetric traffic scenarios and apply them to an 
OFDM-based PHY layer, and (c) to demonstrate the utility of 
WNC via laboratory-scale experimentation using commodity 
wireless radios, notably 802.11a/g. Since much of NC theory 
has traditionally presumed an orthogonal time-scheduled MAC 
layer (such as TDMA), we believe that our results are some 
of the first to estimate the benefits of WNC for a CSMA/CA 
MAC. 

The experiments were conducted using Microsoft's new 
Software Radio (SORA) platform [24] and the open source 
802.11a code. SORA is a fully programmable software radio 
platform based on general purpose multi-core processors in 
commodity PC architecture, developed by Microsoft Research 
Asia (MSRA). With SORA, developers can implement and 
experiment with high-speed wideband wireless protocols (like 
IEEE 802.11a/b) using commodity general-purpose PCs [25]. 

The rest of the paper is organized as follows. In Section II, 
we discuss the challenges and possible solutions of applying 
WNC at both the MAC layer and the PHY layer of a two-way 
wireless relay network. In Section III, we explain how 
to implement the proposed system on the 
SORA platform running 802.11a. Experimental results on 
system performance and throughput are presented in Section 
IV. Finally, the paper concludes with reflections on future 
work in Section V. 

II. System Description 

We propose to apply wireless network coding, both at the 
MAC and at the PHY layer, to a bi-directional relay network, as 
shown in Figure 1. Our hardware platform, SORA, implements the 
802.11a MAC and PHY layers. First, we explain how to 
apply network coding on top of the 802.11 MAC with minimum 
modifications. Then, we describe how the DF-JM scheme is 
implemented to support the new PHY layer NC concepts. 

A. 802.11 MAC Layer Wireless NC 

The 33% improvement in idealized network 
throughput (displayed in Figure 2) for the canonical two-node 
scenario communicating via a relay rests on 
some key assumptions, notably: (a) a scheduled MAC, such as 
TDMA, with implicit node synchronization, and (b) symmetric, 
constant-rate traffic, whereby source nodes have data to send 
in every slot. In the next two subsections, we describe an 
implementation for SORA with 802.11 that bypasses 
both of the above constraints. To the best of our knowledge, 
there has been no demonstration of the benefits of NC in 
an 802.11 infrastructure network, where the Access Points 
act as natural relays between sources and sinks. The known 
short-term unfairness of carrier sense multiple access 
MAC protocols such as 802.11 poses new challenges in 
implementing NC, and will be explored further [26]. 

1) Relay Node: The relay node R (access point in 802.11) 
receives packets from both sources, implements NC operation, 
and broadcasts the result. Due to CSMA/CA channel access, 
the relay may receive many packets from one source before it 
gets any packets from the other. This necessitates two queues 
at the relay - one each for packets from sources A and B, 
respectively. Whenever the relay receives a packet from one 
of the sources, it looks at the other queue. If it finds the other 
queue empty, it queues the packet and waits for packets from 
the other source to arrive. The relay keeps queuing packets 
from a source node until it receives one from the other source. 
Then it XORs one of the queued packets from the first source 
with the (unique) proper packet from the other, and broadcasts 
the result (Figure 3). Roughly speaking, a packet from source 
A (B) is called a proper packet of a packet from source B 
(A) if they have the same reception order (Figure 4). Later, in 
Section III-A, we discuss the choice of such a proper packet 
in more detail. 

Fig. 4. Simple illustration of proper packets 



The queue size at the relay node should be large 
enough to achieve an acceptable packet drop rate. Further, 
when the relay node receives a packet, it has to search the 
queue to find a proper packet to be combined with the received 
one. Since searching a large queue consumes time and 
energy, the queue should also be small 
enough to meet any NC delay constraint. 

2) Source/Sink Nodes: Upon receiving a packet from the 
upper layer, the source node starts listening to the channel. 
Whenever it finds the channel idle, it captures the channel 
and starts transmitting. A copy of the transmitted packet is saved 
in its buffer, since it is needed to decode the network-coded 
packets that carry information from the other source. Whenever 
the node receives a packet from the relay node, it checks the 
destination address. If the packet is a broadcast packet, the node 
fetches the proper packet from its buffer, calculates the XOR of 
the two, and sends the result to the upper layer. 



B. PHY Layer 

Upon receiving the broadcast messages from the relay node, 
each destination decodes its intended message with its own 
signal as side information. As such, the two-way relay exchange can in 
general be regarded as three separate slots: the two sources 
send to the relay node in slots 1 and 2, respectively, and the relay 
broadcasts with side information in slot 3. 

1) Slot 1 and Slot 2: In slot 1 (slot 2), the information 
bits w_A (w_B) are encoded, bit-interleaved, and applied 
to the modulator, generating the symbols x_A (x_B) from the 
respective constellation M_A (M_B), which are transmitted to 
the relay node. Since the transmit rate mainly depends on 
the constellation size, we consider BPSK modulation at both 
sources for symmetric relaying. For asymmetric relaying, 
B transmits QPSK signals while A uses BPSK modulation, or 
vice versa. During the first time slot, the signal received at the 
relay node is thus given by 

y_{AR} = \sqrt{P_A} h_A x_A + z_{AR}, (3) 

where y_{AR} and z_{AR} are the received signal and the zero-mean 
complex Gaussian noise of variance \sigma_R^2 at the relay node, 
respectively. Likewise, the received signal at the relay node 
during the second time slot is 

y_{BR} = \sqrt{P_B} h_B x_B + z_{BR}, (4) 

where h_i, i = A, B, is the channel coefficient of the link 
between the source/destination i and the relay node, P_i is the 
transmit power of source i, and a reciprocal channel is assumed. 

2) Slot 3: During the third time slot, we apply the DF-JM 
scheme at the relay node to broadcast the data. The relay node 
concatenates the decoded information w_A and w_B into a new 
sequence w_R = [w_A w_B], and then encodes and modulates 
the resulting sequence jointly, regardless of the sizes of the 
original messages. As an example, we consider asymmetric 
traffic with BPSK and QPSK constellations. Thus, we have 
w_A = [a_1] and w_B = [b_2, b_1], where a_1, b_2 and b_1 denote 
binary symbols; then w_R = [a_1, b_2, b_1]. The transmitted signal 
from R is x_R = M_R(w_R), using the constellation M_R. 
The corresponding received signals at stations A and B are 
respectively y_A = \sqrt{P_R} h_A x_R + z_A and y_B = \sqrt{P_R} h_B x_R + z_B, 
where z_A and z_B are zero-mean complex Gaussian noise of 
variances \sigma_A^2 and \sigma_B^2. 

With the help of the known redundant bits w_A, node A can 
decode the desired bits w_B using subset partitioning, i.e., by 
restricting detection to the points of M_R whose labels agree 
with w_A. Likewise, B can perform detection from the subset of the 
constellation points based on its known sequence w_B. This 
is explained by the example shown in Figure 5 using an 8-PSK 
constellation at the relay node. The relay node R needs to 
forward w_A = [a_1] and w_B = [b_2, b_1] to both destinations, 
where we assume that a_1 = 0 and [b_2 b_1] = [0 1] without 
loss of generality. By applying the DF-JM scheme, the relay 
node R combines them as w_R = 001 and transmits 
x_R = (001)_{8PSK} using the 8-PSK constellation. Since 
destination A knows a_1 = 0, it only needs 
to consider the possible [b_2 b_1] from the 8-PSK points 
for which the first bit is 0, as shown in Figure 5. Similarly, 
the estimate of a_1 at destination B can be chosen from the 
points (001)_{8PSK} and (101)_{8PSK}. This implies 
that signal detection can be performed over a subset of 
the constellation points instead of the entire constellation. 

Fig. 5. Demapping at both destinations for DF-JM schemes based on optimal 8-PSK labeling 

Fig. 6. Optimal constellation labeling for QPSK constellation at the relay 
node 



Therefore, for the DF-JM scheme, a high-order 
constellation is used at the relay node, but a lower-order 
constellation can be used for de-mapping at sink nodes A 
and B by exploiting the side information. From Figure 5 we 
can also see that if the constellation labeling map is carefully 
chosen at the relay node, the intra-subset Euclidean distance 
(e.g., between (001)_{8PSK} and (101)_{8PSK}) can be greatly increased 
through the side information. It can be verified that the 
labeling for 8-PSK in Figure 5 is optimal; the optimal labeling 
for QPSK is shown in Figure 6. 
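
The subset demapping idea can be sketched as follows. The sketch assumes a natural binary labeling of unit-energy 8-PSK (label k at angle 2πk/8), which is an illustrative choice and not necessarily the optimal labeling of Figure 5; the function names, the noise value, and the unit channel gain are likewise hypothetical.

```python
import cmath
import math

def psk8_point(label: int) -> complex:
    """8-PSK point for a 3-bit label [a1 b2 b1], natural binary labeling (assumed)."""
    return cmath.exp(2j * math.pi * label / 8)

def detect_with_side_info(y: complex, known_mask: int, known_bits: int) -> int:
    """Nearest-point detection restricted to labels that agree with the known bits."""
    candidates = [l for l in range(8) if (l & known_mask) == known_bits]
    return min(candidates, key=lambda l: abs(y - psk8_point(l)))

# Relay sends w_R = [a1 b2 b1] = 001; noise sample is illustrative.
y = psk8_point(0b001) + complex(0.2, -0.15)

# A knows a1 = 0 and searches the four points whose first bit is 0.
label_at_a = detect_with_side_info(y, known_mask=0b100, known_bits=0b000)
b2b1_hat = label_at_a & 0b011

# B knows [b2 b1] = 01 and searches only the points 001 and 101.
label_at_b = detect_with_side_info(y, known_mask=0b011, known_bits=0b001)
a1_hat = label_at_b >> 2

# Under this labeling B's two candidates are antipodal.
subset_distance_b = abs(psk8_point(0b001) - psk8_point(0b101))
```

Under the assumed labeling, B's two candidate points lie at distance 2, compared with roughly 0.77 between adjacent points of the full 8-PSK constellation, which illustrates how side information enlarges the intra-subset distance.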

III. SORA Implementation 

The network stack of the legacy 802.11 implementation in 
SORA is depicted in Figure 13 and consists of three parts: 

• PHY layer: this layer is OFDM-based and similar to the 
802.11a PHY layer. 

• MAC layer: this layer is simply a state machine modeling 
CSMA/CA, the core component of the Distributed 
Coordination Function (DCF) in 802.11a/b/g. 

• Link layer (LL): this layer is responsible for interfacing 
with the TCP/IP layer. When a node receives a MAC frame, 
this layer decides what to do with it. If the packet is 
addressed to this receiver, LL passes it to the TCP/IP layer; 
otherwise, the packet is dropped. 

Next, we describe how this stack was changed to support MAC 
and PHY network coding as described above. 

Fig. 7. Proposed 802.11 network stack for (a) source/sink (b) relay node(s) 
in SORA 



A. NC at MAC layer 

To apply network coding at the MAC layer, the legacy PHY 
and MAC layers implemented in the SORA architecture remain 
untouched. All changes are applied to the Link layer via the addition 
of an NC layer dedicated to the NC operation. As mentioned in 
Section II-A, this layer is different for source/sink nodes and 
the relay node. However, both need a network coding unit, 
which simply XORs the data on its input ports. Figures 7(a) and 
7(b) show the new protocol stack for source/sink nodes and the 
relay node, respectively. 

1) Packet Format: When a source receives a packet from 
the relay node, it knows that the packet is the XOR 
combination of one of its own packets and the desired packet 
transmitted from the other source. But it may already have 
sent a number of packets before receiving any from the relay. 
Hence it needs to know which of its packets was used 
for encoding at the relay node. To solve this problem, Chou 
et al. proposed the concept of a 'generation' for a packet [27]. 
Each packet contains a metadata field in its header representing 
its generation. Only packets from the same generation can 
be combined. Each source has a counter that holds its 
current generation number. At the beginning, both sources 
reset their generation counters to zero. Whenever a source transmits 
a packet, it inserts the current generation number in the 
corresponding field of the packet header and increments the counter 
by one. 

Fig. 8. Field for generation number in packet header 

The relay node only combines two packets from the two sources 
if they are from the same generation. If it receives a packet 
from source A (B) with generation number n, it first searches 
in the buffer dedicated to source B (A) for a packet 
with the same generation number n. At that point, one of the 
following two scenarios will ensue: 

• There is a packet with generation n. In this case the relay 
node combines the two packets and broadcasts the 
result. Then, it looks at the buffer dedicated to A (B) 
and deletes any packet from generation n. 

• The buffer is empty or there is no packet from generation 
n. In this case, the relay node saves the packet in the 
buffer dedicated to A (B). If there is already a packet 
from generation n in the A (B) buffer, it is overwritten 
by the new one. 
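
The two scenarios can be sketched as a small matching routine. The class name `RelayCoder` and the method `on_receive` are hypothetical; the sketch leaves the buffers unbounded and removes the matched packet from the other source's buffer once it has been combined, both of which are simplifying assumptions.

```python
from collections import OrderedDict

class RelayCoder:
    """Generation-matching logic at the relay (illustrative sketch)."""

    def __init__(self):
        # One buffer per source, mapping generation number -> payload.
        self.buffers = {"A": OrderedDict(), "B": OrderedDict()}

    def on_receive(self, source: str, generation: int, payload: bytes):
        """Return the XOR-coded packet to broadcast, or None if the packet was queued."""
        other = "B" if source == "A" else "A"
        match = self.buffers[other].pop(generation, None)  # assumed: consume the match
        if match is not None:
            # Scenario 1: combine, then delete any same-generation packet of this source.
            self.buffers[source].pop(generation, None)
            return bytes(x ^ y for x, y in zip(payload, match))
        # Scenario 2: store the packet, overwriting an existing one of generation n.
        self.buffers[source][generation] = payload
        return None

relay = RelayCoder()
first = relay.on_receive("A", 0, b"\x0f\x0f")   # queued, nothing to broadcast
second = relay.on_receive("A", 1, b"\xaa\xaa")  # queued as well
coded = relay.on_receive("B", 0, b"\xf0\xf0")   # matches generation 0 from A
```

A bounded queue with an eviction policy, as discussed for the relay threads, would replace the unbounded `OrderedDict` in a fuller implementation.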

When a source receives a packet, it looks at its generation 
number. It fetches the packet with the same generation number 
from its buffer, performs the network coding (XOR) operation on the two 
packets, and sends the result to the upper layer. 

Note that the counter that keeps track of a 
packet's generation is simply a register of finite width, which 
means that its value is bounded and wraps around 
after a while. Therefore, when two packets labeled generation n 
are combined, there is a possibility that they actually belong 
to different generations. The probability of this occurring 
depends on the size of the counter and is reduced by 
a sufficiently wide generation ID field (or counter). On 
the other hand, adding a generation ID to the packet header 
increases packet overhead, so the choice 
of generation ID field size represents a tradeoff. 
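
The wrap-around ambiguity can be made concrete with a short sketch. The function names and the field widths used below are hypothetical; the source does not state the actual width of the generation field.

```python
def next_generation(counter: int, field_bits: int) -> int:
    """Increment a generation counter stored in a field of field_bits bits."""
    return (counter + 1) % (1 << field_bits)

def generations_alias(n1: int, n2: int, field_bits: int) -> bool:
    """True if two distinct packet indices carry the same generation ID."""
    return n1 != n2 and (n1 - n2) % (1 << field_bits) == 0

# With an 8-bit field, the 4th and the 260th packet of a source share ID 3.
wrapped = next_generation(255, 8)
collide_8 = generations_alias(3, 259, 8)
collide_16 = generations_alias(3, 259, 16)
```

Widening the field from 8 to 16 bits removes this particular collision at the cost of one extra header byte per packet.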

2) Source/Sink Nodes: Each source/sink node has two 
major threads, one for receiving packets (sink part) and one for 
transmitting (source part). To implement MAC layer NC, 
the receiving thread must know about the packet flow in the 
sending thread, i.e., the receiving thread needs to know exactly 
which packets have been transmitted in order to complete decoding. 
For that reason, as discussed in Section II-A, there is a shared 
buffer with bounded length BUFFER_SIZE between the 
receiving thread and the sending thread in each source node. 

When a packet is transmitted, the sending thread puts a 
copy of the packet in the buffer. If the buffer is full, it 
overwrites the oldest packet in the buffer. When the source 
receives a packet, the receiving thread looks at its generation 
ID and fetches the packet, if any, with the 
same generation number from the buffer. Then the node XORs 
the two packets (the received packet and the one from the buffer) 
and sends the result to the upper layer. If there is no packet 
with that generation ID, the receiving thread sends the received 
packet unchanged to the upper layer. If that packet has undergone NC 
(XOR at the relay node), then the CRC check at the upper layer fails 
and the packet is dropped; only a non-coded packet originating 
directly from the other source is accepted. Note that since the buffer 
is shared between two threads, lock protection between the threads is 
needed [28]. 
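
A minimal sketch of such a lock-protected shared buffer is given below. The class `SentPacketBuffer` and its methods are hypothetical names, the value chosen for `BUFFER_SIZE` is illustrative, and the sketch assumes that a fetched packet stays in the buffer after decoding.

```python
import threading
from collections import OrderedDict

BUFFER_SIZE = 4  # illustrative value; the source does not state the size used

class SentPacketBuffer:
    """Bounded buffer shared by the sending and receiving threads (sketch)."""

    def __init__(self, size: int = BUFFER_SIZE):
        self._size = size
        self._packets = OrderedDict()   # generation -> payload, oldest first
        self._lock = threading.Lock()   # protects access from both threads

    def store(self, generation: int, payload: bytes) -> None:
        """Called by the sending thread after each transmission."""
        with self._lock:
            self._packets.pop(generation, None)
            if len(self._packets) >= self._size:
                self._packets.popitem(last=False)  # overwrite the oldest packet
            self._packets[generation] = payload

    def decode(self, generation: int, received: bytes) -> bytes:
        """Called by the receiving thread; XORs with the stored copy if present."""
        with self._lock:
            own = self._packets.get(generation)
        if own is None:
            return received  # passed up unchanged; the upper-layer CRC decides
        return bytes(x ^ y for x, y in zip(received, own))
```

For example, with `size=2`, storing generations 0, 1 and 2 evicts generation 0, so a later coded packet for generation 0 is passed up unchanged and would fail the CRC check.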

Fig. 9. Multithreading and shared queue configuration at the relay node 

3) Relay Node: The relay node has three major threads: 
the receiving thread for A, the receiving thread for B, and the 
sending thread (Figure 9). The two receiving threads share two 
buffers, as depicted in Figure 3, one for each source. When 
the relay receives a packet from A (B), its dedicated thread 
wakes up and extracts the generation number from the received 
packet. It then searches the buffer dedicated to B (A) 
for a packet with the same generation ID. If there is such 
a packet, the relay node combines the two packets, signals 
the sending thread, and hands the resulting packet to it for 
broadcast. If there is no packet with the same generation ID 
as the received packet, the received packet is saved in the buffer 
dedicated to A (B). If the buffer is full, the oldest packet is removed, 
handed to the sending thread for broadcast, and replaced with 
the new packet. 

As depicted in Figure 9, the three threads at the relay node 
share the sending buffer. Whenever a packet is ready to 
be sent, it is placed in the sending buffer. If the buffer 
is full, the oldest packet is overwritten. The sending 
thread continuously monitors the sending buffer; while the 
buffer is not empty, the sending thread fetches the oldest 
packet and sends it to the PHY layer. 

B. NC at PHY layer 

1) Frame Structure: As shown in Figure 10, the physical 
layer is based on frames of 11.52 ms duration, comprising 
beacon and data components. The beacon consists of two 
identical band-limited pseudo-random signals, mainly used for 
frame synchronization and frequency offset estimation. The 
data signal is composed of 220 OFDM symbols. Each OFDM 
symbol includes 256 subcarriers, of which only 198 are 
used: 168 data subcarriers, 12 continuation 
pilots and 18 scattered pilots. The three types of signals are 
transmitted over the OFDM time-frequency grid as in Figure 
11. The information for the physical layer, such as the modulation 
and coding scheme and the type of network coding, is 
carried in the continuation pilots. The continuation pilots 
also serve for frequency synchronization and phase tracking. 
The rectangularly distributed scattered pilots serve as a reference 
for channel estimation, and the power of the continuation pilots 
and scattered pilots is set 3 dB higher than that of the data signal. 

Note that a basic OFDM PHY is implemented here to support the 
experiments, which can be extended to the PHY frame in 802.11a. 

Fig. 10. PHY symbol frame structure in OFDM system 

Fig. 11. Signal types in time-frequency grid 



Table I summarizes the detailed PHY parameters, which support 
BPSK, QPSK, and 8-PSK modulation schemes for different 
relay scenarios. 

2) Packet Relaying Processing: Since DF-JM is 
employed at the relay node, we extend the OFDM PHY with the 
DF-JM scheme; the associated signal processing for the 
transmitter and receiver is shown in Figure 12. In the third time 
slot, when the relay node becomes a transmitter, two LDPC 
encoders are used to encode the information from the two 
sources. Based on channel conditions, three types of modulation 
are considered in this paper: BPSK, QPSK and 8-PSK packets. 
For symmetric relaying, nodes A and B transmit BPSK 
packets to the relay node R, and a QPSK packet can then be 
transmitted by node R using the DF-JM scheme. When 
node R is placed near node B, the link between B and 
R can support QPSK transmission, allowing node R to 
transmit an 8-PSK packet based on the DF-JM scheme in the third 
time slot. For comparison, both optimal labeling and Gray mapping 
for QPSK and 8-PSK packets are implemented at node R, 
and the labeling information is carried by the continuation pilots. 



TABLE I 
OFDM PHY PARAMETERS 

Parameter                 Value 
Frequency band            2.422 GHz 
Signal bandwidth          4.254 MHz 
Subcarrier spacing        21.484 kHz 
Symbol duration (data)    46.546 µs 
Guard interval duration   5.8182 µs 
Frame length              11.52 ms 
LDPC length               9216 
Code rate                 1/2 
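
The entries of Table I are mutually consistent, as the following arithmetic sketch suggests. The constant names are hypothetical, and the guard fraction of 1/8 is an assumption inferred from the ratio of the tabulated guard interval to the symbol duration.

```python
SUBCARRIER_SPACING_HZ = 21.484e3
USED_SUBCARRIERS = 168 + 12 + 18   # data + continuation pilots + scattered pilots
OFDM_SYMBOLS_PER_FRAME = 220
GUARD_FRACTION = 1 / 8             # assumption inferred from 5.8182 / 46.546

# Useful symbol duration is the inverse of the subcarrier spacing.
symbol_us = 1e6 / SUBCARRIER_SPACING_HZ
guard_us = symbol_us * GUARD_FRACTION

# Occupied bandwidth is the number of used subcarriers times the spacing.
bandwidth_mhz = USED_SUBCARRIERS * SUBCARRIER_SPACING_HZ / 1e6

# Duration of 220 OFDM symbols including their guard intervals.
frame_ms = OFDM_SYMBOLS_PER_FRAME * (symbol_us + guard_us) / 1e3
```

The computed values agree with the table to within rounding: about 46.546 µs, 5.818 µs, 4.254 MHz, and 11.52 ms. How the beacon duration is accounted for within the frame length is not specified in the text.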



Fig. 12. Signal processing in the relay node and destinations (source/sink 
nodes) for PHY layer NC: (a) transmitter at the relay node, using optimal 
constellation labeling; (b) receiver at destination i (OFDM demodulator, 
demapper with side information w_i, LDPC decoder). 



IV. Experimental Results 
A. Sora Platform 

Software-defined radios (SDRs) have been attracting increasing
attention recently. In an SDR system, components that are
typically implemented in hardware (e.g., mixers, filters,
amplifiers, modulators/demodulators, detectors, etc.) are instead
implemented in software on a suitable hardware platform [29].
Changing a component implemented in software is easier and
faster, which makes a communication system built on an SDR
more flexible to modify.

Microsoft's Software Defined Radio platform (SORA) consists
of three fundamental components [24]:

• RF front-end

• Radio Control Board (RCB)

• SDR application driver

Figure 13 illustrates the SORA architecture. The RF front-end
represents the well-defined interface between the digital
and analog domains. It contains analog-to-digital (A/D) and
digital-to-analog (D/A) converters and the necessary circuitry
for radio transmission. Since all signal processing is done in
software, the RF front-end design is rather generic. The RF
front-end in SORA is interchangeable; in this work, we use
the WARP radio board [30].

The RCB is a new PC interface board that establishes a
high-throughput, low-latency path for transferring high-fidelity
digital signals between the RF front-end and PC memory.
To achieve high system throughput, the RCB uses a high-speed,
low-latency PCIe bus, which supports a throughput of up to
16.7 Gbps in x8 mode. The large on-board memory further
allows caching of pre-computed waveforms, adding flexibility
for software radio processing. Finally, the RCB provides a
low-latency control path for software to control the RF
front-end hardware and to ensure it is properly synchronized
with the host CPU.

In SORA, SDR components are located in a separate upper-level
driver, called an SDR Application Driver, where the
customized Link layer, MAC and PHY are implemented. An
SDR application driver accesses the hardware resources via
the Sora Core API inside the Core Library. The core library
implements the common functions of radio and other hardware
resource management, such as radio board configuration and
control. Specifically, it provides the necessary system services
and programming support for implementing a wireless PHY on a
general-purpose multicore processor. Several computationally
intensive processing tasks, such as down-sampling, the Fast
Fourier Transform (FFT), and LDPC decoding, are optimized by
exploiting data parallelism with the Single Instruction
Multiple Data (SIMD) instruction sets of the CPU.

Fig. 13. Hardware/Software structure in SORA [24]

B. NC at MAC layer: Experiments 

We use three i7 personal computers supporting PCIe. Each
computer is equipped with a SORA board and a WARP radio
board as the RF front-end. To remove any hidden node
problem, we place each node on a vertex of an equilateral
triangle with a side length of 1 m.² In this configuration,
every node is within coverage range of the others. However,
source nodes only accept packets from the relay node.

Each node operates in 802.11a basic mode with a
transmission rate of 1 Mbps. Each source holds a 135 MB
file that it aims to send to the other source. Packets are
1 KB long; i.e., it takes roughly 8 ms to send a packet.
The generation ID is 32 bits long, which is a negligible
overhead compared to the length of the packets. For the sake
of simplicity, in this work, the same amount of memory is
assigned to every queue in the system architecture discussed
in Section III-A. In other words, all queues have the
same size in each experiment. Using different sizes for each
queue and observing the effect on system performance
is deferred to future work.
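
The packet timing and overhead figures quoted above follow from simple
arithmetic, sketched below. Two interpretations are assumptions made
only for this illustration: that "1 KB" means 1024 bytes and that
"135 MB" means 135 x 1024 x 1024 bytes. MAC and PHY header overheads
are ignored.

```python
PACKET_BYTES = 1024              # assumption: "1 KB" = 1024 bytes
BIT_RATE_BPS = 1_000_000         # 1 Mbps, as stated in the text
GENERATION_ID_BITS = 32          # as stated in the text
FILE_BYTES = 135 * 1024 * 1024   # assumption: "135 MB" = 135 MiB

packet_bits = PACKET_BYTES * 8

# Time on air for one packet at the stated bit rate, in milliseconds.
airtime_ms = packet_bits / BIT_RATE_BPS * 1e3

# Fraction of each packet taken up by the generation ID.
id_overhead = GENERATION_ID_BITS / packet_bits

# Number of packets each source has to deliver.
packets_per_file = FILE_BYTES // PACKET_BYTES

print(f"airtime per packet: {airtime_ms:.3f} ms")
print(f"generation ID overhead: {id_overhead:.2%}")
print(f"packets per file: {packets_per_file}")
```

Under these assumptions the airtime is slightly above 8 ms per packet
and the generation ID accounts for well under 1% of each packet, which
agrees with the "roughly 8 ms" and "negligible overhead" statements.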

Figure 14 compares packet loss rates in the relay network
with and without network coding. As discussed in Section
III-A, using network coding increases the likelihood of losing
packets. One solution to this problem is to increase the
size of the buffer. As one can see, a large enough queue
overcomes any packet loss caused by MAC layer NC.
However, increasing the queue size, as Figure 15 shows,
increases the average delay in the system. Here, we define
delay as the amount of time needed for a packet to travel
from one source to the other. In this figure, we normalize the
y-axis such that a packet length is 1 s. At the cost of a higher
incidence of packet loss and slightly more computational
complexity at the nodes, MAC layer NC decreases packet
delay at low queue sizes. Increasing the queue size to
decrease packet loss cancels out this advantage of shorter
packet delay, as shown in Figure 15.

²Estimating the impact of hidden nodes is out of the scope of this paper.

Fig. 14. Average packet loss rate in two-way relay on 802.11 framework
with and without NC at MAC layer (legend: With Network Coding,
Without Network Coding)

Fig. 15. Average delay in two-way relay on 802.11 framework with and
without NC at MAC layer. The delay axis has been scaled such that the
packets are 1 second long (equivalently 1 Mbit long).

System throughput (packets/s/node) is depicted in Figure 16,
where throughput is the average number of successfully
received packets at each node per second. Compared to the
baseline TDMA relay protocol (4-step without network coding),
NC at the MAC layer increases the throughput by about
20-30%. While the system throughput when applying NC to the
CSMA/CA MAC is always less than 35 packets/s/node, for an
ideal TDMA-based system the throughput would be around 41
packets/s/node. That is, when network coding is applied
at the MAC layer, randomness in capturing the channel causes a
nearly 15% reduction in system throughput compared to an
ideal deterministic TDMA MAC.
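
The throughput figures above can be related to the packet airtime with
a short back-of-the-envelope calculation. The sketch below is an
idealized model, not the measurement procedure used in the experiments:
it assumes a slot equal to the airtime of one 1024-byte packet at
1 Mbps, no MAC or PHY overhead, and that one complete exchange delivers
exactly one packet to each end node. The function name is hypothetical.

```python
def ideal_tdma_throughput(slots_per_exchange, slot_s):
    """Packets/s received per end node when one exchange delivers one
    packet to each end node and takes slots_per_exchange slots."""
    return 1.0 / (slots_per_exchange * slot_s)

SLOT_S = 8.192e-3  # assumption: 1024-byte packet at 1 Mbps, no overhead

# Four-step exchange without coding: A->R, R->B, B->R, R->A.
four_step = ideal_tdma_throughput(4, SLOT_S)

# Three-step exchange with coding: A->R, B->R, R broadcasts the XOR.
three_step = ideal_tdma_throughput(3, SLOT_S)

# Upper bound on the relative gain of three steps over four.
ideal_gain = three_step / four_step - 1.0

# Shortfall of the measured CSMA/CA bound (35) from the ideal value (41).
csma_shortfall = 1.0 - 35.0 / 41.0

print(f"4-step ideal: {four_step:.1f} packets/s/node")
print(f"3-step ideal: {three_step:.1f} packets/s/node")
print(f"ideal gain:   {ideal_gain:.1%}")
print(f"CSMA/CA shortfall: {csma_shortfall:.1%}")
```

In this model the three-step value rounds to 41 packets/s/node, which
is consistent with the ideal TDMA figure quoted above, and the ratio
35/41 corresponds to a reduction of a little under 15%.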

Finally, to test the sensitivity of MAC layer NC, we change
the power of one of the nodes and keep the other parameters
of the system untouched. In each experiment, we halve the
power of node B and measure the system throughput and packet
loss rate for each source/sink in the system, as depicted in
Figure 17.

Fig. 16. System throughput (per node) in information exchange for an 802.11
network when MAC layer NC is applied



Figure 17(a) presents the following result: when the power
of node B decreases, its packet loss rate increases while the
packet loss rate of the other node (node A) remains almost
the same. For MAC layer NC, the relay node needs to receive
packets from both nodes in order to do the encoding (XOR
the packets) and broadcast the result. Hence packets from node
A are buffered until a packet successfully arrives from node
B. This means that the delay from A to B increases or,
equivalently, throughput decreases. As one can see in Figure
17(b), asymmetry in the system decreases the throughput of
node A as well. The reason is as follows: the buffer dedicated
to node A at the relay node fills up because of the high
packet loss rate from node B. As discussed in Section III-A,
the relay node then starts broadcasting uncoded packets from
node A to prevent buffer overflow. This means the sending
buffer at the relay usually holds some packets from A waiting
to be sent. Therefore, when a packet is received from B and is
encoded with the corresponding packet (i.e., same generation)
from node A, the coded packet spends some time in the relay's
sending buffer, increasing the delay from B to A and degrading
throughput at A.
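
The relay behavior described above can be sketched in a few lines. This
is a simplified model under stated assumptions, not the SORA driver
code: the class and function names are hypothetical, packets are
matched by generation ID, and when a per-source buffer is full the
oldest buffered packet is forwarded uncoded.

```python
from collections import deque

def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two payloads, zero-padding the shorter one."""
    n = max(len(a), len(b))
    a, b = a.ljust(n, b"\x00"), b.ljust(n, b"\x00")
    return bytes(x ^ y for x, y in zip(a, b))

class Relay:
    """Toy relay: pairs packets of the same generation and XORs them."""

    def __init__(self, queue_size):
        self.queue_size = queue_size
        self.pending = {"A": {}, "B": {}}  # generation ID -> payload
        self.send_queue = deque()          # (kind, generation ID, payload)

    def receive(self, src, gen_id, payload):
        other = "B" if src == "A" else "A"
        partner = self.pending[other].pop(gen_id, None)
        if partner is not None:
            # Both halves of the generation are present: broadcast the XOR.
            self.send_queue.append(("coded", gen_id, xor_bytes(payload, partner)))
            return
        buf = self.pending[src]
        if len(buf) >= self.queue_size:
            # Assumed overflow policy: forward the oldest packet uncoded.
            oldest = min(buf)
            self.send_queue.append(("uncoded:" + src, oldest, buf.pop(oldest)))
        buf[gen_id] = payload

# Node B's packets for generations 0 and 1 are lost; generation 2 arrives.
relay = Relay(queue_size=2)
a_pkts = {g: f"A-pkt-{g}".encode() for g in range(3)}
b_pkt2 = b"B-pkt-2"
for g in range(3):
    relay.receive("A", g, a_pkts[g])
relay.receive("B", 2, b_pkt2)

kinds = [(kind, g) for kind, g, _ in relay.send_queue]
_, _, coded = relay.send_queue[-1]
recovered_at_a = xor_bytes(coded, a_pkts[2])  # A removes its own packet
recovered_at_b = xor_bytes(coded, b_pkt2)     # B removes its own packet
print(kinds, recovered_at_a, recovered_at_b)
```

In this run the coded packet for generation 2 is queued behind an
uncoded packet from A, which mirrors the extra B-to-A delay described
above: the more of B's packets are lost, the more uncoded traffic from
A sits ahead of each coded packet in the relay's sending buffer.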

C. NC at PHY layer 

As our experience in the previous subsection shows, unequal
packet loss rates decrease system performance. PHY layer NC
is proposed to deal with such asymmetric traffic. In this
subsection, we show the performance of the demonstration
platform with the novel PHY prototype. In the test setup,
the platform was implemented on three PCs equipped
with external antennas. By adjusting the transmit power and
the distance between transmit and receive antenna pairs, the
channel condition can be changed for different scenarios.

In Figures 18 and 19, we illustrate the received signal after
channel equalization at node A for two-way symmetric
and asymmetric traffic. The end-to-end throughput for both
symmetric and asymmetric traffic under different network
coding schemes is presented in Figure 20. It is observed that,
irrespective of whether the traffic is symmetric or asymmetric,
a three-step information exchange scheme based on DF-JM with
optimal labeling significantly improves the network throughput
compared to the four-step information exchange scheme. It
is noteworthy that the optimal labeling provides a nearly
100% gain over Gray mapping at an SNR of around 7 dB for
asymmetric relaying, but does not yield the same throughput
improvement for symmetric relaying. This behavior is
reasonable because an SNR of 7 dB is enough to support the
transmission of a QPSK signal without decoding errors.

Fig. 17. Reducing transmission power at node B to measure (a) packet loss
rate and (b) throughput of each node when MAC layer network coding is used

[Figure 18: constellation plot; horizontal axis: In-Phase]

Fig. 18. The received signal after channel equalization at A for symmetric
relaying, SNR = 16.0632 dB

V. Conclusion 

We modified the 802.11 MAC to support network coding on the
SORA platform from Microsoft, and conducted experiments
to measure system performance (delay, packet loss rate, and
throughput) in a three-node relay network. Our results
showed that MAC layer NC improves system throughput by twenty
to thirty percent, but at the cost of an increased packet loss rate.

Moreover, we explored one of the fundamental problems
with MAC layer NC: its known sensitivity to asymmetric
traffic. System throughput drops dramatically when the
system supporting MAC layer NC is not symmetric. We
explained how PHY layer NC can be used to overcome
this problem. We designed and implemented an OFDM-based
PHY (similar to the 802.11a PHY) to support NC. Our lab-scale
experiment on SORA showed that, for asymmetric traffic, our
implemented PHY layer NC increases throughput by thirty
percent compared to the traditional 4-step TDMA exchange.

Fig. 19. The received signal after channel equalization at A for asymmetric
relaying, SNR = 16.4238 dB

[Figure 20: throughput per flow; x-axis (Flow): A->B for both traffics,
B->A for symmetric traffic, B->A for asymmetric traffic]

Fig. 20. The end-to-end throughput (per flow) in information exchange for
an OFDM-based network when PHY layer NC is applied